ABSTRACT

Teachers have acknowledged the richer learning environment 
and interactivity of multimedia teaching, its flexibility to different 
learning styles, and learner control that allows the learner to fully engage 
in the learning process. However, they still have problems in courseware 
design because their work is mainly centered on exercises and not on what the 
machine can do best. This is why a new audio segmenting device, called 
Virtual Recorder, has been derived from the LAVAC (Laboratoire Audio-Visuel
Actif-Comparatif) toolkit to allow them to use videos. The video sequencer
can complete real-time automatic segmenting of sound and images and
automatically insert an answering time span after each sequence. When the
sequencer is coupled to IBM ViaVoice, the teacher can speak during this time
span to create the transcript from which the necessary textual help will be
derived for easier aural comprehension. A 5-minute video requires no more than
a few minutes' work from the teacher to produce a 2-hour student session. (Author/MES)
Abstract: Teachers have already acknowledged the richer learning environment and
interactivity of multimedia teaching, its flexibility to different learning styles, and the learner
control that allows learners to engage fully in the learning process.

But they still have problems in courseware design because their work is mainly centered on
exercises and not on what the machine can do best. This is why a new device has been designed
to allow them to use their videos. The video sequencer performs real-time automatic
segmenting of sound and images and automatically inserts an answering time span after each
sequence.

When the sequencer is coupled to IBM ViaVoice, the teacher can speak during this time span to
create the transcript from which the necessary textual help will be derived for easier aural
comprehension. A 5-minute video then requires no more than a few minutes' work from the
teacher to produce a two-hour student session.



Introduction 

The aim of IT for a language teacher is to provide interactive simulations of language use through individual
virtual learning environments. This highlights an important issue: the need for a particular multimedia learning
system and courseware that will actually respond to specific user requirements.

Therefore, when the first teacher-controlled multimedia computerized language laboratories appeared six years
ago, the challenge was to enable teachers to use the built-in authoring system without any previous computer
experience. Several easy-to-use programs with user-friendly interfaces have been developed to help them
digitize and edit sound, attach pictures and sounds to gap-filling or multiple-choice exercises, and achieve a
multimedia integration that diminishes the weaknesses of each medium used separately.

Such systems allow the design and development of multimedia-based tutoring through embedded training
packages and networked communication applications. They should offer learner support, assessment tools and
the maximum interactivity between teachers and students, whether the teacher is present or not.

But if it is essential to know the possibilities of the system, it is even more important to define as accurately as
possible the types of learning procedures that need to be implemented to help teachers produce their own
customized courseware and eventually a powerful interface for learning.


Objective

Knowledge representation and learning will be built up through proposed tasks, but these tasks should suit as
many cognitive learner types as possible (analytic, synthetic, kinesic, etc.). The problem is then to implement
metalinguistic learning activities that suit these different cognitive types, knowing that each learner belongs,
to a greater or lesser degree, to most or all of them.

Several solutions can be proposed, among them a hierarchized list of deduction techniques for sound
recognition and understanding, i.e. wave spectrograms, phonetic transcription, lexical hints, written form of
words, and translation. Etymology, knowledge of the discourse situation, and contextual logic will be put to use
to enable learners to find the meaning of words or groups of words by themselves.

Everything should be designed to help and encourage the learner to carry out his or her tasks alone, with 
imposed hints if necessary for all of them and proposed ones only for those who need them. 

Keywords in multimedia teaching are learner’s control, hypernavigation, interactivity, and multimodality. But
learner’s control certainly does not mean an unlimited time to answer or easy access to the solution. The problem
is that most courseware programs leave the learner free to read the solution after a single mouse click or to read
the transcript during an aural comprehension task.¹

As a cognitive scientist and a linguist, I have therefore designed the implementation of ten fundamental
prerequisites for a multimedia language authoring system:

1. multimediatizing, or the use of the multimedia language directly from a didactized content of information the
teacher has in mind. Detailed design is achieved through doing

2. multimedia expression through a coherent semiological code and a multimedia language still to improve and
stabilize

3. easy-to-use authoring system for non-programming teachers

4. adaptation of the produced software to different cognitive types

5. adaptation of the produced software to different personality types

6. set of hints that will implement metalinguistic activities based on deduction techniques, so that the mental
processes involved can be reused in other linguistic contexts

7. automatic segmenting with answering time and segment visualization

8. personalized access to the solution decided by the teacher in relation to the learner’s metacognitive
performance

9. presence of the teacher in the room for personalized tutoring while the learners work on the machines, and
communication activities in small groups still in the same room

10. possibility of real-time modifications of answering time, textual hints, questions or even mistakes ... on the
part of the teacher.



Methodology 

The implementation of these prerequisites began in 1992 when I designed the LAVAC® (Toma, 1993). This
acronym stands for "Laboratoire Audio-Visuel Actif-Comparatif", i.e. Audio-Visual Active-Comparative
Laboratory. Its development is still under way with an HTML version, but the stated objectives have now been
reached with the latest version (4.03. i).

LAVAC has become one of the most popular computerized language laboratories in France since 1993, with 
around 5 000 software programs used in more than 150 university departments and high schools in France and 
abroad. 

The system consists of a complete network of student terminals, plus a courseware-design workstation, all linked
to a server. It was the first to use a teacher’s console for face-to-face or distance tutoring, which avoids the
well-known 'wandering' and twisted paths of a learner lost in a traditional Resource Center.

A LAVAC courseware is in fact a set of segments or sequences, each with an automatically assigned number plus
an optional name or wording of the teacher's choice, linked to sound, images, videos, texts and tutor zones (for
proposed hints or exercises), making up to six different media altogether.
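
The structure just described can be pictured with a small data model. This is only an illustrative sketch: the paper does not publish LAVAC's internal format, so the class names, field names and file names below are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Segment:
    """One numbered sequence; every media link is optional (hypothetical model)."""
    number: int                      # assigned automatically, in creation order
    name: Optional[str] = None       # optional wording chosen by the teacher
    sound: Optional[str] = None      # the six media a segment may be linked to
    image: Optional[str] = None
    video: Optional[str] = None
    text: Optional[str] = None
    hint: Optional[str] = None       # tutor zone used for a proposed hint
    exercise: Optional[str] = None   # tutor zone used for an exercise


@dataclass
class Courseware:
    segments: List[Segment] = field(default_factory=list)

    def add_segment(self, name=None, **media):
        """Append a segment, numbering it automatically from 1."""
        segment = Segment(number=len(self.segments) + 1, name=name, **media)
        self.segments.append(segment)
        return segment


lesson = Courseware()
first = lesson.add_segment(sound="seg1.wav", text="Lexical help for segment 1")
second = lesson.add_segment(name="Exercise 1", exercise="Answer the question.")
```

The point of the sketch is that the teacher never numbers anything by hand: a segment exists as soon as it is created, and media are attached to it afterwards, only where needed.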

This has mainly been designed for oral comprehension and production, i.e. for listening (with 24 listening
modes) and recording, but students can also type in their answers, either in a learning or testing mode, and will
be guided by hypertext or hypermedia links in case of mistakes.

The problem is that this tool may have been designed too early. Teachers were not ready yet. As Carlson (1998)
puts it: "A technology-enabled curriculum should be conceptualized as a dynamic partnership among three
agents: the student, the teacher, and the computer-mediated tools". Six years ago, few students knew how to use
a computer and a lot of training was needed. Furthermore, the expensive machines (PC 386) were slow in
playing the wave files, which had to be short, especially because these files were provided by a distant server
through a Novell network. The network nevertheless was the only solution to overcome the low capacity of disks
(540 MB) and enable discreet or active tutoring.

The three agents were not present at that time, but now the situation has changed. Students have become familiar
with cheap machines powerful enough to play real-time full-screen videos on a 100 Mbit Windows NT network.
Are teachers a problem? Most observers see them as conservative and technophobic. But this so-called negative
attitude, and the difficulty of educating teachers who, as educators themselves, are sometimes the last to
accept being educated, should not be overemphasized, simply because teachers, and not the industry, are at the
heart of the system.

The main problem with teachers is that they see multimedia as a "combination of texts, audio, and pictures on a
single platform. At its best, it should recombine the benefits of 'conventional' Computer-Assisted Language
Learning (text reconstruction exercises, tests games, etc.) with those of videos, together with the advantage of
being able to jump instantly to the desired frame rather than having to rely on the rewind or fast-forward keys"
(Eastment, 1999). Eastment synthesizes the problem and its solution in this comment.

¹ See Toma, T. (2000). Cognition and Courseware Design by Teachers: the Concept of Multimediatizing. SITE 2000,
San Diego, Ca, for a more complete description of the present situation.

The mistake is to believe that multimedia courseware design just consists in digitizing different media already
edited separately and in integrating them with an authoring system, even if, as with LAVAC, no
programming is required (for at least 99% of the functions). The question for the teachers is how to integrate the
different media. A complete rethinking of their pedagogical practices has to be undertaken. Some are ready for
it, some are not. The need to turn "conventional" teachers into computer-literate ones has been well
developed in Nierderhauser (1996), but few training courses integrate epistemological thinking on the
changes involved in didactic practices (Toma, 1997).

Few training courses explain, moreover, that integration should not be carried out in the "ancient" way (still in
use, however), which consisted in giving scripts, audio and video cassettes, and exercises on paper to a team of
developers who, because they are not teachers, usually have difficulty understanding teachers’ specific
requirements and produce a result often far from the one expected.

This is why I imagined the concept of multimediatizing², which consists in the direct expression of a didactized
content in a multimedia form. No storyboard is needed with the LAVAC system since for language teaching
the privileged medium is SOUND. The automatic segmenting of the sound track produces numbered
sequences to which text, images, and exercises can be attached if needed.

That is why the solution lies in using what the machine can do best. Sound processing in LAVAC is real-time,
and the minimum pedagogical work on the part of the teacher is to write the transcript and analyze the lexical
and syntactic difficulties for his or her specific students. Help will be given in the corresponding text zones,
and images will be linked to the right sound segment as a prop to the discourse situation. Links are created by a
simple mouse click.

Exercises are often difficult to implement with an authoring system. Ironically enough the most profitable ones
are not the easiest to complete. The first one is note-taking, which can be done on a simple sheet of paper or in
a text zone that is very easy to create. This phase can be labeled the "content appropriation" phase.

Then, in the testing phase, exercises are done with the notes (with possible returns to the
informational content).

The first exercise is a written question with open answers. An answering text zone simply has to be linked to
the sequence labeled "Exercise 1", and this zone will open when the student selects "Exercise 1" and clicks on
"Record". The written answers are immediately recorded on the server for instant retrieval in case of
modifications or analysis by the teacher.

The second exercise is a transcription of a passage of the sound track. It is just the same as note-taking but all the
words have to be written. A gap-filling exercise could then easily be set up with a blank for all the words or just
for those that need to be tested, but a transcript on paper handed to the teacher proved just as effective,
since the point is not to know that a word has been badly understood, but to understand why. Students really
ask for the teacher’s opinion on their problems, simply because they can see that the teacher has more time for
them.

The third exercise is a so-called "simple" exercise since it is a repetition of a part of the sound track. It is
automatically implemented by LAVAC since a recordable answering time span is set by default after each
segment. But this task is not so easy for learners, since they have to discriminate the words, understand them,
remember them and pronounce them in a limited time.

The fourth exercise consists of oral questions that have to be answered orally. Here the sequences are manually
created by a mouse click, the question is recorded by the teacher, and he or she decides upon the length of the
answering time span by entering a number of seconds.

The fifth is a translation exercise, which is, from a computer point of view, exactly the same as the first. The
teacher types the sentence to be translated and creates an answering zone. The students’ written answers will then
be automatically saved in their respective files.

Most teachers use this model even if gap-filling or multiple-choice questions are also possible.

However, the majority of teachers who discover multimedia teaching agree to use it only on condition that they
produce their own educational software, because the programs available on the market are too general and will
not satisfy their didactic needs. They still do not know what to do, since they think they will not have
time to learn how to use even simple tools.

For this reason, I designed a new system derived from LAVAC that presents a new student interface for the
LAVAC audio-segmenting device and spares teachers from having to link text or images to the sound segments.
This device, called "Virtual Recorder"®, is an audio-sequencer and appeared in 1998. The problem was even
more complex with videos.

Automatic segmenting of the sound AND the images was not easy to achieve for reasons of synchronization
between these two media. Moreover, the sound volume is sometimes kept constant by background music on
some videos. Since automatic segmenting works by detecting blanks or volume drops in the sound signal,
this problem of background music had to be overcome. This video-segmenting device, or "Video Sequencer"®,
then appeared in March 1999.

² See again Toma, T. (2000). Cognition and Courseware Design by Teachers: the Concept of Multimediatizing. SITE 2000,
San Diego, Ca.

Training for teachers is thus shorter. An exercise sheet can even be given to each student at the end of the lab
session. Exercises can be done at home and the correction will take place in a normal classroom. If students
spend less time in the language lab, a greater number of them will be able to use it.

The methodology described here can therefore fit different didactic contexts. Three software tools can thus be
used: the complete LAVAC software (authoring system, student interface, teacher interface (for distance
tutoring), networking tool) for complete multimedia courseware, the audio-sequencer, and the video-sequencer,
both compatible with the LAVAC student interface.



Results 



I have produced several educational programs with this methodology by using the complete
LAVAC toolkit (Toma, 1996). Tests have been running for six years now, with an average of 300 students a year.
Students appreciate the possibility of retrieving their recordings (text or sound) more than attractive exercises
that are mainly complex to build. They sometimes consider the teacher as a tool when they urgently need him or
her in case of a problem, but most of the time as a guide and a confidant for their own particular problems.
Experiments are still in progress with the audio and the video sequencer, so I would like to limit the results to the
LAVAC segmenting device, the use of the video sequencer alone, and the use of this tool with IBM
ViaVoice and the LAVAC courseware station.

The LAVAC segmenting device 

The segmenting device first needs to make a distinction between what is language sound and what is not. This
can be set on a 1-127 scale. A high level will be used if the background noise is significant. At a value of 30,
segmenting still takes place, but the words have to be pronounced loudly; if not, they might be interpreted as
noise. A value of 5 is used in a quiet environment. Under 5, it has to be very quiet: the computer noise may then
be interpreted as language, and no segmenting will occur since that noise is continuous!

The normal values will then vary between 5 (quiet) and 15 (rather loud).

The next setting concerns the length of the blanks to be detected. This setting can vary from 0.2 second to 8
seconds. A hard segmenting rate of the sound track at a value of 0.2 second will give a large number of short
sound segments (one word or more) to suit weaker students, whereas a medium rate of 0.5 second will
give a smaller number of longer sound segments for average-level students. Values above 1 second will be used
to segment a sound track into large paragraphs.
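
The two settings above, the noise threshold and the minimum blank length, are enough to sketch how blank-based segmenting could work. The paper does not describe the actual LAVAC algorithm, so what follows is an assumption-laden illustration: it supposes that the sound track has already been reduced to one volume reading per frame, and the function and parameter names are invented for the example.

```python
def segment_by_blanks(levels, frame_ms, threshold=5, min_blank_ms=500):
    """Split a track into sound segments separated by blanks.

    levels       -- one volume reading per frame (0 = silence, up to 127)
    frame_ms     -- duration of one frame in milliseconds
    threshold    -- readings at or below this value count as non-speech
    min_blank_ms -- a quiet run at least this long closes the current segment
    Returns a list of (start_ms, end_ms) pairs.
    """
    # Number of consecutive quiet frames that make up a blank (rounded up).
    min_blank_frames = max(1, -(-min_blank_ms // frame_ms))
    segments = []
    start = None       # first frame of the segment being built
    last_loud = None   # most recent frame above the threshold
    for i, level in enumerate(levels):
        if level > threshold:
            if start is None:
                start = i
            last_loud = i
        elif start is not None and i - last_loud >= min_blank_frames:
            segments.append((start * frame_ms, (last_loud + 1) * frame_ms))
            start = None
    if start is not None:  # the track ended inside a segment
        segments.append((start * frame_ms, (last_loud + 1) * frame_ms))
    return segments


levels = [0, 0, 20, 20, 20, 0, 0, 20, 0, 0, 0, 0, 0, 30, 30, 0]
medium = segment_by_blanks(levels, frame_ms=100, min_blank_ms=500)
hard = segment_by_blanks(levels, frame_ms=100, min_blank_ms=200)
```

On this toy track the 0.5-second setting yields two segments and the 0.2-second setting yields three, which mirrors the trade-off described above: the shorter the blank to be detected, the more numerous and the shorter the segments.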

Another setting concerns the length of the answering time span created after each segment during the
segmentation process. The length of this span is proportional to the length of each created segment. The
proportional value can be parameterized in a 10 to 999% range. I usually use a 150% value for repetitions,
which means for instance that there is a 3-second answering time after a created sound segment of 2 seconds.
But if this answering time is used as an automatic pause in aural comprehension, for a better (because slower)
understanding process and for note taking, the value should preferably be set at 300 to 400%.
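
The proportional rule is simple arithmetic, shown here as a minimal sketch. The function name and the range check are assumptions for illustration; only the 10 to 999% range and the 150% example come from the text.

```python
def answering_time(segment_s, percent=150):
    """Answering time span, in seconds, for a segment lasting segment_s seconds."""
    if not 10 <= percent <= 999:
        raise ValueError("percent must lie in the 10-999 range")
    return segment_s * percent / 100


repetition_pause = answering_time(2)          # 150% of a 2-second segment
note_taking_pause = answering_time(2, 400)    # 400% of the same segment
```

With the default 150% setting, a 2-second segment is followed by a 3-second span, as in the example above; at 400% the same segment is followed by 8 seconds.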

The digitizing of the audiocassettes of the old cassette labs has also been planned for. At a 100% value, the
durations of the recorded blanks on the cassettes (corresponding to the answering time spans) are respected. But
teachers find it necessary to shorten the blanks simply because the students' language level has risen. A
complete re-recording of the cassette would be necessary with a traditional recording system, either analog or
digital. With this system, the recording of the cassette can be done in real time on the server with a different
value for the time spans. On top of that, if the new value proves unsatisfactory, it can be changed while the
students are working on the network.
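
Rescaling the blanks of a digitized cassette can be sketched in the same spirit. The timeline representation and the function name are hypothetical; the sketch only illustrates that changing the percentage touches the blanks and leaves the recorded speech untouched.

```python
def rescale_blanks(timeline, percent):
    """Return a copy of timeline with every blank scaled to percent % of its length.

    timeline -- list of (kind, seconds) pairs, kind being "sound" or "blank"
    """
    return [
        (kind, seconds * percent / 100 if kind == "blank" else seconds)
        for kind, seconds in timeline
    ]


cassette = [("sound", 4), ("blank", 6), ("sound", 2), ("blank", 3)]
shorter = rescale_blanks(cassette, 50)   # halve every answering blank
```

Because the function returns a new list, the original timing stays available, which matches the idea that an unsatisfactory value can simply be replaced by another one.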

The video segmenting device uses these LAVAC parameters except for the audio cassette settings.

The student video sequencer interface³

The Video Sequencer uses the whole set of didactic prerequisites, this time applied to an analog VHS recording,
an analog or digital camcorder recording, or a live satellite program.

³ Images would explain these different settings better than words, but they cannot be inserted here in a 6-page
article. Nevertheless, these explanations are available with images on my ftp site: ftD://130.120 1 19 o/Tnfiv/fim nipgn
For more details on the system see http://www a1i7i»s.fi~/cD3i

When the video starts to be digitized, it can immediately be seen on each student monitor. Sound is heard but no
student control is possible. Nevertheless, a student who comes late will see and hear the video from the
beginning and not from the part that is being digitized at that moment. Or if a weaker student thinks there will
not be enough time to understand more than 3 minutes of video, he or she can stop the playback for a while and
resume it later, when the first part has been completed.

For a better data transfer through the network, the AVI file sent to each student PC is MJPEG compressed (by a
factor of about 7).

The advantage of the system is that segmenting occurs during this data transfer but is not visible when the video
is played for the first time. As soon as digitizing is finished, the system automatically switches to the
segmenting mode.

The segmented video immediately starts in the recording mode. Experience has shown that a learner will tend to
watch the video intuitively as on television, which may induce a passive attitude. On the contrary, the recording
mode shows that the machine waits for an action on the part of the learner. The first segment is played with
sound waves in green on the "teacher" track, and after the first segment a red line appears, which is the sound
track of the student when s/he does not speak. As soon as the student speaks, waves appear in red. But if s/he
does not, the video goes on and the second sequence is played. The video will thus be automatically paused after
each segment, which favors better understanding and note taking.

Unexpectedly, most students stop the video after three or four sequences have been played, just to see the
"menu" of their work. Each segment is numbered and represented by a square on a line. It is then possible, using
the direction keys, to move rightward or leftward on this line, or to go straight to the end or the beginning by
pressing the appropriate keys. In fact they want to be aware of the average length of each segment, since 60
segments for a 3-minute video will be much easier to listen to than 10.

In each square small lines appear in different colors according to the status of the segment: yellow for a
non-played segment, green for a played segment, red for a recorded one, and green under the red when the
recording has been listened to.
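
The color code amounts to a small mapping from segment state to the lines drawn in its square. The sketch below is an illustration only: the function name and the three boolean flags are assumptions, and the colors are listed from top to bottom.

```python
def status_colors(played, recorded, reviewed):
    """Return the colored lines shown in a segment's square, top to bottom.

    played   -- the segment has been listened to
    recorded -- the student has recorded an answer for it
    reviewed -- the student has listened to that recording
    """
    if recorded:
        # A recorded segment is red; green appears under it once reviewed.
        return ["red", "green"] if reviewed else ["red"]
    return ["green"] if played else ["yellow"]


untouched = status_colors(played=False, recorded=False, reviewed=False)
finished = status_colors(played=True, recorded=True, reviewed=True)
```

Read this way, a glance at the line of squares tells a student which segments are still untouched, which have only been heard, and which have been both recorded and checked.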

One of the most used tricks is to click inside the sound wave to insert an index from which the video starts
immediately, as often as needed. After each click of the mouse a vertical black line comes up in the yellow line.
These indexes set by students help them find the segments which posed comprehension problems. They do not
then need to jot the segment numbers down for easier retrieval. A recap key plays the
segment from the index when necessary.

The pedagogical interest of sound waves was questioned at the beginning. Some teachers even saw them as a
gadget. I was doubtful myself. But English pronunciation is so stressed compared to French that, first, this
forced students to speak louder into their microphones to make their spectrograms as accentuated as the
master track's, and second, they could visualize sounds that they would not have noticed otherwise. Even when
they still do not understand a sound, they can tell the difference between what is understood and what is not.

Three "working" modes have then been implemented to increase the range of learning tasks: teacher then student
(sequential mode) for recording; teacher and student simultaneously (but in the recording mode, this would mean
that the student knows the transcript by heart); and role play (in that case, the student can answer freely the
questions asked if s/he takes the role of the interviewee, or must know them by heart if s/he takes the role of the
interviewer). Role play is the students' favorite activity because they suddenly have the feeling of becoming
part of the video (at least through their voice)!

Another surprising result concerns the listening of students' recordings in the sequential listening mode. The
LAVAC setting reproduces the classical model of the language lab: teacher listening, student recording, teacher's
correction, student's repetition of correction. I chose a different option for the video sequencer. After the
recording phase, the listening phase starts with the student's recording and not the teacher's. Tests proved
that students were much more attentive to the teacher's production (the master track in that instance) after they
had heard their own production than the other way round. The reason for that still needs further checking, but it
seems that students are first eager to hear their own recording. If they hear the teacher's track first, they will
not listen to all of it since they are awaiting their own production. Once they have heard themselves, they are
more inclined to listen to the right pronunciation, so that they become more conscious of the distance between
both productions and immediately try to reduce it.

The teacher videosequencer with IBM ViaVoice and the LAVAC courseware station. 

After real-time segmenting, the teacher just has to repeat each segmented part of the sound track in the
following blank created by the system. Words are then written by ViaVoice in a special window with 90%
accuracy, but the silence in the recording room must be total.

A LAVAC lesson has first been prepared and copied as a model. Hints can then easily be made from the
transcript and pasted into a window of the corresponding LAVAC sequences.

Students will therefore be able to type their own transcript using the aids. This transcript can be visualized by the 
teacher through the network and when sufficient work has been accomplished by the student, the correction can 
be sent to him or her. 

Links to a database for vocabulary, grammar, or civilisation purposes can eventually be made, with
the necessary connections to the Net.



Conclusion 

More than 5 000 LAVAC software programs are being used in French universities. A number of experimental
protocols are still in progress, mainly carried out by cognitive scientists. The video sequencer seems to be the
easiest and most efficient tool to use for non-programming teachers.

At least this is perhaps the solution to engage them later in a full courseware design process.